10 research outputs found

    A federated learning framework for the next-generation machine learning systems

    Master's dissertation in Industrial Electronics and Computers Engineering (specialization in Embedded Systems and Computers). The end of Moore's Law, combined with rising concerns about data privacy, is forcing machine learning (ML) to shift from the cloud to the deep edge, close to the data source. In next-generation ML systems, inference and part of the training process will be performed right on the edge, while the cloud will be responsible for major ML model updates. This new computing paradigm, referred to by academic and industry researchers as federated learning, relieves the cloud and network infrastructure while increasing data privacy. Recent advances have made it possible to efficiently execute the inference pass of quantized artificial neural networks on Arm Cortex-M and RISC-V (RV32IMCXpulp) microcontroller units (MCUs). Nevertheless, training is still confined to the cloud, which requires transferring high volumes of private data over a network. To tackle this issue, this MSc thesis makes the first attempt to run decentralized training on Arm Cortex-M MCUs. To port part of the training process to the deep edge, the thesis proposes L-SGD, a lightweight version of stochastic gradient descent (SGD) optimized for maximum speed and minimal memory footprint on Arm Cortex-M MCUs. L-SGD is 16.35x faster than the TensorFlow solution while reducing the memory footprint by 13.72%. This comes at the cost of a negligible accuracy drop of only 0.12%. To merge the local model updates returned by edge devices, the thesis proposes R-FedAvg, an implementation of the FedAvg algorithm that reduces the impact of faulty model updates returned by malicious devices.
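The merging step mentioned in the abstract can be illustrated with a minimal sketch of plain federated averaging, in which each client's weights count in proportion to its local sample count. This is not R-FedAvg: the abstract does not say how faulty updates are detected or down-weighted, so no such filtering is shown. The function name `fed_avg` and the flat-list weight representation are assumptions made for illustration.

```python
from typing import List, Sequence


def fed_avg(client_weights: Sequence[Sequence[float]],
            client_sizes: Sequence[int]) -> List[float]:
    """Merge client weight vectors, weighting each by its local sample count.

    Hypothetical helper: plain FedAvg only, with no handling of
    faulty or malicious updates.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    merged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all weight vectors must have the same length")
        for i, value in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            merged[i] += value * size / total
    return merged


if __name__ == "__main__":
    # Two clients: the second holds three times as much data as the first.
    print(fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

A robust variant would have to decide which updates to trust before averaging; that decision rule is what the abstract attributes to R-FedAvg and is left out here.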

    Train me if you can: decentralized learning on the deep edge

    The end of Moore’s Law, combined with data privacy concerns, is forcing machine learning (ML) to shift from the cloud to the deep edge. In next-generation ML systems, inference and part of the training process will be performed at the edge, while the cloud stays responsible for major updates. This new computing paradigm, called federated learning (FL), relieves the cloud and network infrastructure while increasing data privacy. Recent advances have enabled the inference pass of quantized artificial neural networks (ANNs) on Arm Cortex-M and RISC-V microcontroller units (MCUs). Nevertheless, training remains confined to the cloud, which requires transferring high volumes of private data over a network and leads to unpredictable delays when ML applications attempt to adapt to adversarial environments. To fill this gap, we make the first attempt to evaluate the feasibility of ANN training on Arm Cortex-M MCUs. Of the available optimization algorithms, stochastic gradient descent (SGD) has the best trade-off between accuracy, memory footprint, and latency. However, its original form and the variants available in the literature still do not fit the stringent requirements of Arm Cortex-M MCUs. We propose L-SGD, a lightweight implementation of SGD optimized for maximum speed and minimal memory footprint in this class of MCUs. We developed a floating-point version and another that operates over quantized weights. For a fully-connected ANN trained on the MNIST dataset, L-SGD (float-32) is 4.20× faster than SGD while requiring only 2.80% of the memory, with negligible accuracy loss. Results also show that quantized training is still unfeasible for training an ANN from scratch, but is a lightweight solution for performing minor model fixes and counteracting the fairness problem in typical FL systems. This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020. This work has also been supported by FCT within the PhD Scholarship Project Scope: SFRH/BD/146780/2019.
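For readers unfamiliar with the baseline, the sketch below shows the textbook per-sample SGD update (parameter minus learning rate times gradient) on a single linear neuron with a squared-error loss. It is not L-SGD: the abstract does not describe the optimizations that make L-SGD faster and smaller. The names `sgd_step` and `train`, the toy data, and the learning rate are all assumptions.

```python
from typing import List, Tuple


def sgd_step(w: float, b: float, x: float, y: float,
             lr: float) -> Tuple[float, float]:
    """One textbook SGD update for the model w*x + b with loss 0.5*(pred - y)^2."""
    err = (w * x + b) - y
    # Gradient of the loss is err*x with respect to w and err with respect to b.
    return w - lr * err * x, b - lr * err


def train(samples: List[Tuple[float, float]], lr: float = 0.1,
          epochs: int = 200) -> Tuple[float, float]:
    """Run per-sample SGD over the data for a fixed number of epochs."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            w, b = sgd_step(w, b, x, y, lr)
    return w, b


if __name__ == "__main__":
    # Toy data drawn from y = 2x + 1 (illustrative values, not from the paper).
    data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
    print(train(data))
```

Only the current parameters and one sample are held at a time, which is why per-sample SGD is a natural starting point for memory-constrained targets.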

    Shifting capsule networks from the cloud to the deep edge

    Capsule networks (CapsNets) are an emerging trend in image processing. In contrast to a convolutional neural network, CapsNets are not vulnerable to object deformation, as the relative spatial information of the objects is preserved across the network. However, their complexity is mainly related to the capsule structure and the dynamic routing mechanism, which makes it almost unreasonable to deploy a CapsNet, in its original form, in a resource-constrained device powered by a small microcontroller (MCU). In an era where intelligence is rapidly shifting from the cloud to the edge, this high complexity imposes serious challenges to the adoption of CapsNets at the very edge. To tackle this issue, we present an API for the execution of quantized CapsNets in Arm Cortex-M and RISC-V MCUs. Our software kernels extend the Arm CMSIS-NN and RISC-V PULP-NN to support capsule operations with 8-bit integers as operands. Along with it, we propose a framework to perform post-training quantization of a CapsNet. Results show a reduction in memory footprint of almost 75%, with accuracy loss ranging from 0.07% to 0.18%. In terms of throughput, our Arm Cortex-M API enables the execution of primary capsule and capsule layers with medium-sized kernels in just 119.94 and 90.60 milliseconds (ms), respectively (STM32H755ZIT6U, Cortex-M7 @ 480 MHz). For the GAP-8 SoC (RISC-V RV32IMCXpulp @ 170 MHz), the latency drops to 7.02 and 38.03 ms, respectively
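The reported memory reduction of almost 75% matches what one would expect from storing each 32-bit float weight as an 8-bit integer (1 byte instead of 4). The sketch below illustrates that arithmetic with symmetric per-tensor quantization. This scheme is an assumption chosen for simplicity: the abstract does not specify the quantization scheme used by the authors' framework, and the helper names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization to the range [-127, 127] (assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map int8 values back to approximate floats."""
    return [q * scale for q in quantized]


def memory_reduction(num_weights: int) -> float:
    """Fraction of weight storage saved by going from float32 to int8."""
    float_bytes = 4 * num_weights
    int8_bytes = 1 * num_weights
    return 1.0 - int8_bytes / float_bytes


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, s = quantize_int8(weights)
    print(q, s, dequantize(q, s))
    print(memory_reduction(len(weights)))
```

The saving falls slightly short of 75% in practice because quantization parameters such as scales also have to be stored, which is consistent with the abstract's "almost".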

    ATLANTIC-PRIMATES: a dataset of communities and occurrences of primates in the Atlantic Forests of South America

    Primates play an important role in ecosystem functioning and offer critical insights into human evolution, biology, behavior, and emerging infectious diseases. There are 26 primate species in the Atlantic Forests of South America, 19 of them endemic. We compiled a dataset of 5,472 georeferenced locations of 26 native and 1 introduced primate species, as well as hybrids in the genera Callithrix and Alouatta. The dataset includes 700 primate communities, 8,121 single species occurrences and 714 estimates of primate population sizes, covering most natural forest types of the tropical and subtropical Atlantic Forest of Brazil, Paraguay and Argentina and some other biomes. On average, primate communities of the Atlantic Forest harbor 2 ± 1 species (range = 1–6). However, about 40% of primate communities contain only one species. Alouatta guariba (N = 2,188 records) and Sapajus nigritus (N = 1,127) were the species with the most records. Callicebus barbarabrownae (N = 35), Leontopithecus caissara (N = 38), and Sapajus libidinosus (N = 41) were the species with the fewest records. Recorded primate densities varied from 0.004 individuals/km² (Alouatta guariba at Fragmento do Bugre, Paraná, Brazil) to 400 individuals/km² (Alouatta caraya in Santiago, Rio Grande do Sul, Brazil). Our dataset reflects the disparity between the numerous primate censuses conducted in the Atlantic Forest and the scarcity of estimates of population sizes and densities. With these data, researchers can develop different macroecological and regional level studies, focusing on communities, populations, species co-occurrence and distribution patterns. Moreover, the data can also be used to assess the consequences of fragmentation, defaunation, and disease outbreaks on different ecological processes, such as trophic cascades, species invasion or extinction, and community dynamics. There are no copyright restrictions. Please cite this Data Paper when the data are used in publications. We also request that researchers and teachers inform us of how they are using the data. © 2018 by The Authors. Ecology © 2018 The Ecological Society of America

    Characterisation of microbial attack on archaeological bone

    As part of an EU funded project to investigate the factors influencing bone preservation in the archaeological record, more than 250 bones from 41 archaeological sites in five countries spanning four climatic regions were studied for diagenetic alteration. Sites were selected to cover a range of environmental conditions and archaeological contexts. Microscopic and physical (mercury intrusion porosimetry) analyses of these bones revealed that the majority (68%) had suffered microbial attack. Furthermore, significant differences were found between animal and human bone in both the state of preservation and the type of microbial attack present. These differences in preservation might result from differences in early taphonomy of the bones. © 2003 Elsevier Science Ltd. All rights reserved

    NEOTROPICAL ALIEN MAMMALS: a data set of occurrence and abundance of alien mammals in the Neotropics

    No full text
    Biological invasion is one of the main threats to native biodiversity. For a species to become invasive, it must be voluntarily or involuntarily introduced by humans into a nonnative habitat. Mammals were among the first taxa to be introduced worldwide for game, meat, and labor, yet the number of species introduced in the Neotropics remains unknown. In this data set, we make available occurrence and abundance data on mammal species that (1) transposed a geographical barrier and (2) were voluntarily or involuntarily introduced by humans into the Neotropics. Our data set is composed of 73,738 historical and current georeferenced records on alien mammal species, of which around 96% correspond to occurrence data on 77 species belonging to eight orders and 26 families. Data cover 26 continental countries in the Neotropics, ranging from Mexico and its frontier regions (southern Florida and coastal-central Florida in the southeast United States) to Argentina, Paraguay, Chile, and Uruguay, and the 13 countries of Caribbean islands. Our data set also includes neotropical species (e.g., Callithrix sp., Myocastor coypus, Nasua nasua) considered alien in particular areas of the Neotropics. The most numerous species in terms of records are from Bos sp. (n = 37,782), Sus scrofa (n = 6,730), and Canis familiaris (n = 10,084); 17 species were represented by only one record (e.g., Syncerus caffer, Cervus timorensis, Cervus unicolor, Canis latrans). Primates have the highest number of species in the data set (n = 20 species), partly because of uncertainties regarding taxonomic identification of the genus Callithrix, which includes the species Callithrix aurita, Callithrix flaviceps, Callithrix geoffroyi, Callithrix jacchus, Callithrix kuhlii, Callithrix penicillata, and their hybrids. This unique data set will be a valuable source of information on invasion risk assessments, biodiversity redistribution and conservation-related research. There are no copyright restrictions. Please cite this data paper when using the data in publications. We also request that researchers and teachers inform us of how they are using the data.

    Global variation in postoperative mortality and complications after cancer surgery: a multicentre, prospective cohort study in 82 countries

    No full text
    © 2021 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY-NC-ND 4.0 license. Background: 80% of individuals with cancer will require a surgical procedure, yet little comparative data exist on early outcomes in low-income and middle-income countries (LMICs). We compared postoperative outcomes in breast, colorectal, and gastric cancer surgery in hospitals worldwide, focusing on the effect of disease stage and complications on postoperative mortality. Methods: This was a multicentre, international prospective cohort study of consecutive adult patients undergoing surgery for primary breast, colorectal, or gastric cancer requiring a skin incision done under general or neuraxial anaesthesia. The primary outcome was death or major complication within 30 days of surgery. Multilevel logistic regression determined relationships within three-level nested models of patients within hospitals and countries. Hospital-level infrastructure effects were explored with three-way mediation analyses. This study was registered with ClinicalTrials.gov, NCT03471494. Findings: Between April 1, 2018, and Jan 31, 2019, we enrolled 15 958 patients from 428 hospitals in 82 countries (high income 9106 patients, 31 countries; upper-middle income 2721 patients, 23 countries; or lower-middle income 4131 patients, 28 countries). Patients in LMICs presented with more advanced disease compared with patients in high-income countries. 30-day mortality was higher for gastric cancer in low-income or lower-middle-income countries (adjusted odds ratio 3·72, 95% CI 1·70–8·16) and for colorectal cancer in low-income or lower-middle-income countries (4·59, 2·39–8·80) and upper-middle-income countries (2·06, 1·11–3·83). No difference in 30-day mortality was seen in breast cancer. The proportion of patients who died after a major complication was greatest in low-income or lower-middle-income countries (6·15, 3·26–11·59) and upper-middle-income countries (3·89, 2·08–7·29). Postoperative death after complications was partly explained by patient factors (60%) and partly by hospital or country (40%). The absence of consistently available postoperative care facilities was associated with seven to 10 more deaths per 100 major complications in LMICs. Cancer stage alone explained little of the early variation in mortality or postoperative complications. Interpretation: Higher levels of mortality after cancer surgery in LMICs were not fully explained by later presentation of disease. The capacity to rescue patients from surgical complications is a tangible opportunity for meaningful intervention. Early death after cancer surgery might be reduced by policies focusing on strengthening perioperative care systems to detect and intervene in common complications. Funding: National Institute for Health Research Global Health Research Unit

    Effects of hospital facilities on patient outcomes after cancer surgery: an international, prospective, observational study

    No full text
    © 2022 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. Background: Early death after cancer surgery is higher in low-income and middle-income countries (LMICs) than in high-income countries, yet the impact of facility characteristics on early postoperative outcomes is unknown. The aim of this study was to examine the association between hospital infrastructure, resource availability, and processes on early outcomes after cancer surgery worldwide. Methods: A multimethods analysis was performed as part of the GlobalSurg 3 study, a multicentre, international, prospective cohort study of patients who had surgery for breast, colorectal, or gastric cancer. The primary outcomes were 30-day mortality and 30-day major complication rates. Potentially beneficial hospital facilities were identified by using variable selection to find those associated with 30-day mortality. Adjusted outcomes were determined using generalised estimating equations to account for patient characteristics and country-income group, with population stratification by hospital. Findings: Between April 1, 2018, and April 23, 2019, facility-level data were collected for 9685 patients across 238 hospitals in 66 countries (91 hospitals in 20 high-income countries; 57 hospitals in 19 upper-middle-income countries; and 90 hospitals in 27 low-income to lower-middle-income countries). The availability of five hospital facilities was inversely associated with mortality: ultrasound, CT scanner, critical care unit, opioid analgesia, and oncologist. After adjustment for case-mix and country income group, hospitals with three or fewer of these facilities (62 hospitals, 1294 patients) had higher mortality compared with those with four or five (adjusted odds ratio [OR] 3·85 [95% CI 2·58–5·75]; p<0·0001), with excess mortality predominantly explained by a limited capacity to rescue following the development of major complications (63·0% vs 82·7%; OR 0·35 [0·23–0·53]; p<0·0001). Across LMICs, improvements in hospital facilities would prevent one to three deaths for every 100 patients undergoing surgery for cancer. Interpretation: Hospitals with higher levels of infrastructure and resources have better outcomes after cancer surgery, independent of country income. Without urgent strengthening of hospital infrastructure and resources, the reductions in cancer-associated mortality associated with improved access will not be realised. Funding: National Institute for Health and Care Research
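As a rough plausibility check on the capacity-to-rescue figures, the sketch below computes a crude (unadjusted) odds ratio directly from the two percentages quoted in the abstract, reading them as rescue proportions of 63·0% and 82·7%. That reading is an assumption, and the calculation is not the study's method: the reported OR of 0·35 comes from adjusted models, so the crude value is only expected to land nearby. The function names are hypothetical.

```python
def odds(proportion: float) -> float:
    """Convert a proportion in (0, 1) to odds."""
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    return proportion / (1.0 - proportion)


def crude_odds_ratio(p_group: float, p_reference: float) -> float:
    """Unadjusted odds ratio of one group's proportion against a reference."""
    return odds(p_group) / odds(p_reference)


if __name__ == "__main__":
    # Proportions quoted in the abstract: 63.0% vs 82.7%.
    ratio = crude_odds_ratio(0.630, 0.827)
    print(round(ratio, 2))
```

The crude ratio comes out at about 0.36, close to the adjusted 0·35 reported in the abstract.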